different type
Accelerated Sparse Neural Training: A Provable and Efficient Method to Find N:M Transposable Masks
Unstructured pruning reduces the memory footprint in deep neural networks (DNNs). Recently, researchers proposed different types of structural pruning intending to reduce also the computation complexity. In this work, we first suggest a new measure called mask-diversity which correlates with the expected accuracy of the different types of structural pruning. We focus on the recently suggested N:M fine-grained block sparsity mask, in which for each block of M weights, we have at least N zeros. While N:M fine-grained block sparsity allows acceleration in actual modern hardware, it can be used only to accelerate the inference phase.
Uncertainty Aware Semi-Supervised Learning on Graph Data
Thanks to graph neural networks (GNNs), semi-supervised node classification has shown the state-of-the-art performance in graph data. However, GNNs have not considered different types of uncertainties associated with class probabilities to minimize risk of increasing misclassification under uncertainty in real life. In this work, we propose a multi-source uncertainty framework using a GNN that reflects various types of predictive uncertainties in both deep learning and belief/evidence theory domains for node classification predictions. By collecting evidence from the given labels of training nodes, the Graph-based Kernel Dirichlet distribution Estimation (GKDE) method is designed for accurately predicting node-level Dirichlet distributions and detecting out-of-distribution (OOD) nodes. We validated the outperformance of our proposed model compared to the state-of-the-art counterparts in terms of misclassification detection and OOD detection based on six real network datasets. We found that dissonance-based detection yielded the best results on misclassification detection while vacuity-based detection was the best for OOD detection. To clarify the reasons behind the results, we provided the theoretical proof that explains the relationships between different types of uncertainties considered in this work.
Quantifying Uncertainty in Machine Learning-Based Pervasive Systems: Application to Human Activity Recognition
Balditsyn, Vladimir, Lalanda, Philippe, Vega, German, Chollet, Stéphanie
The recent convergence of pervasive computing and machine learning has given rise to numerous services, impacting almost all areas of economic and social activity. However, the use of AI techniques precludes certain standard software development practices, which emphasize rigorous testing to ensure the elimination of all bugs and adherence to well-defined specifications. ML models are trained on numerous high-dimensional examples rather than being manually coded. Consequently, the boundaries of their operating range are uncertain, and they cannot guarantee absolute error-free performance. In this paper, we propose to quantify uncertainty in ML-based systems. To achieve this, we propose to adapt and jointly utilize a set of selected techniques to evaluate the relevance of model predictions at runtime. We apply and evaluate these proposals in the highly heterogeneous and evolving domain of Human Activity Recognition (HAR). The results presented demonstrate the relevance of the approach, and we discuss in detail the assistance provided to domain experts.
- Europe > France > Auvergne-Rhône-Alpes > Isère > Grenoble (0.05)
- Asia > Middle East > Jordan (0.04)
Understanding Privacy Risks in Code Models Through Training Dynamics: A Causal Approach
Yang, Hua, Velasco, Alejandro, Fang, Sen, Xu, Bowen, Poshyvanyk, Denys
Large language models for code (LLM4Code) have greatly improved developer productivity but also raise privacy concerns due to their reliance on open-source repositories containing abundant personally identifiable information (PII). Prior work shows that commercial models can reproduce sensitive PII, yet existing studies largely treat PII as a single category and overlook the heterogeneous risks among different types. We investigate whether distinct PII types vary in their likelihood of being learned and leaked by LLM4Code, and whether this relationship is causal. Our methodology includes building a dataset with diverse PII types, fine-tuning representative models of different scales, computing training dynamics on real PII data, and formulating a structural causal model to estimate the causal effect of learnability on leakage. Results show that leakage risks differ substantially across PII types and correlate with their training dynamics: easy-to-learn instances such as IP addresses exhibit higher leakage, while harder types such as keys and passwords leak less frequently. Ambiguous types show mixed behaviors. This work provides the first causal evidence that leakage risks are type-dependent and offers guidance for developing type-aware and learnability-aware defenses for LLM4Code.
- South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
- North America > United States > North Carolina (0.04)
- Information Technology > Security & Privacy (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
How does Alignment Enhance LLMs' Multilingual Capabilities? A Language Neurons Perspective
Zhang, Shimao, Lai, Zhejian, Liu, Xiang, She, Shuaijie, Liu, Xiao, Gong, Yeyun, Huang, Shujian, Chen, Jiajun
Multilingual Alignment is an effective and representative paradigm to enhance LLMs' multilingual capabilities, which transfers the capabilities from the high-resource languages to the low-resource languages. Meanwhile, some research on language-specific neurons provides a new perspective to analyze and understand LLMs' mechanisms. However, we find that there are many neurons that are shared by multiple but not all languages and cannot be correctly classified. In this work, we propose a ternary classification methodology that categorizes neurons into three types, including language-specific neurons, language-related neurons, and general neurons. And we propose a corresponding identification algorithm to distinguish these different types of neurons. Furthermore, based on the distributional characteristics of different types of neurons, we divide the LLMs' internal process for multilingual inference into four parts: (1) multilingual understanding, (2) shared semantic space reasoning, (3) multilingual output space transformation, and (4) vocabulary space outputting. Additionally, we systematically analyze the models before and after alignment with a focus on different types of neurons. We also analyze the phenomenon of ''Spontaneous Multilingual Alignment''. Overall, our work conducts a comprehensive investigation based on different types of neurons, providing empirical results and valuable insights to better understand multilingual alignment and multilingual capabilities of LLMs.
- North America > Canada > Alberta > Census Division No. 6 > Calgary Metropolitan Region > Calgary (0.14)
- North America > Canada > Ontario > Toronto (0.14)
- Asia > China > Hubei Province > Wuhan (0.04)
- North America > Canada > Quebec > Montreal (0.04)
- Asia > China > Fujian Province > Fuzhou (0.05)
- North America > United States > Georgia > Fulton County > Atlanta (0.04)
- Europe > Greece (0.04)
- Asia > China > Zhejiang Province > Hangzhou (0.04)
- North America > Canada > Alberta > Census Division No. 6 > Calgary Metropolitan Region > Calgary (0.14)
- North America > Canada > Ontario > Toronto (0.14)
- Asia > China > Hubei Province > Wuhan (0.04)
- North America > Canada > Quebec > Montreal (0.04)
- North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
- North America > United States > Iowa (0.04)
- North America > Canada (0.04)
- Leisure & Entertainment > Games > Computer Games (1.00)
- Health & Medicine (1.00)